Add ARJ, ARC, and ACE archive decompression support by Copilot · Pull Request #194 · microsoft/RecursiveExtractor

Copilot · 2026-02-06T18:11:43Z

gfs · 2026-02-06T18:41:29Z

/azp run

azure-pipelines · 2026-02-06T18:41:44Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Adds native extraction support for ARJ, ARC, and ACE archives (via SharpCompress) to RecursiveExtractor, including MiniMagic type detection and test coverage to validate detection + extraction output counts.

Changes:

Upgraded SharpCompress to enable ARJ/ARC/ACE reader support.
Added MiniMagic signatures and new ArchiveFileType enum values for ARJ/ARC/ACE, plus registered new extractors.
Added small fixture archives and expanded unit tests to cover detection and extracted-file counts.

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 1 comment.

File	Description
RecursiveExtractor/RecursiveExtractor.csproj	Bumps SharpCompress dependency to a version that supports ARJ/ARC/ACE.
RecursiveExtractor/MiniMagic.cs	Adds ARJ/ARC/ACE types and signature detection; increases header read size.
RecursiveExtractor/Extractors/ArjExtractor.cs	New ARJ extractor implementation (sync + async).
RecursiveExtractor/Extractors/ArcExtractor.cs	New ARC extractor implementation (sync + async), with post-decompression size check.
RecursiveExtractor/Extractors/AceExtractor.cs	New ACE extractor implementation (sync + async).
RecursiveExtractor/Extractor.cs	Registers ARJ/ARC/ACE extractors in `SetDefaultExtractors()`.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.arj	Adds ARJ test fixture archive.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.arc	Adds ARC test fixture archive.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.ace	Adds ACE test fixture archive.
RecursiveExtractor.Tests/RecursiveExtractor.Tests.csproj	Ensures new fixture archives are copied to test output.
RecursiveExtractor.Tests/ExtractorTests/MiniMagicTests.cs	Adds MiniMagic detection tests for ARJ/ARC/ACE fixtures.
RecursiveExtractor.Tests/ExtractorTests/ExpectedNumFilesTests.cs	Adds extraction count expectations for ARJ/ARC/ACE fixtures.
.github/copilot-instructions.md	Documents deep-clone requirement for Nerdbank.GitVersioning builds.

RecursiveExtractor/MiniMagic.cs

gfs · 2026-02-06T18:56:33Z

@copilot it doesn't intuitively seem like it should be related, but with this change I'm seeing the tests for TestDataEncrypted.7z fail with 0 results found on both the lib and cli tests. One theory I have: is there perhaps an issue with the changes to MiniMagic that is causing an incorrect header detection on that file?

SharpCompress 0.44.5 now throws CryptographicException at Open() time for encrypted 7z archives, leaving the stream at a non-zero position. Subsequent password retry calls to Open() then fail with EndOfStreamException. Reset fileEntry.Content.Position = 0 before each retry, matching the pattern already used by RarExtractor and ZipExtractor. Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Copilot · 2026-02-06T19:04:02Z

@copilot it doesn't intuitively seem like it should be related - but with this change I'm seeing the tests for TestDataEncrypted.7z fail with 0 results found on both the lib and cli tes...

Good catch — it's not MiniMagic but a side effect of the SharpCompress 0.44.5 upgrade. In the old version, SevenZipArchive.Open() on encrypted archives succeeded silently and detection happened later via .IsEncrypted. In 0.44.5, Open() now throws CryptographicException immediately, which leaves the stream at a non-zero position (e.g. 1775 bytes in). The subsequent password-retry Open() calls then fail with EndOfStreamException because the stream isn't rewound.

Fixed in 9444f82 by adding fileEntry.Content.Position = 0 before each retry — matching the pattern already used by RarExtractor and ZipExtractor.
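The rewind-before-retry pattern described above can be sketched without SharpCompress. This is a minimal illustration, not the code from 9444f82: `OpenWithPassword` is a hypothetical stand-in for an archive `Open()` call that consumes header bytes and then throws, and the 4-byte header and the `"secret"` password are assumptions made only for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class RewindBeforeRetry
{
    // Hypothetical stand-in for an archive Open() call: it consumes a 4-byte
    // header, then throws on a wrong password, leaving the stream position
    // past the start (as the comment above describes for encrypted 7z).
    public static void OpenWithPassword(Stream content, string password)
    {
        var header = new byte[4];
        if (content.Read(header, 0, header.Length) < header.Length)
        {
            throw new EndOfStreamException("Header truncated: stream was not rewound.");
        }
        if (password != "secret")
        {
            throw new CryptographicException("Wrong password.");
        }
    }

    // Tries each candidate password, rewinding the stream before every attempt.
    public static bool TryPasswords(Stream content, IEnumerable<string> passwords)
    {
        foreach (var password in passwords)
        {
            content.Position = 0; // the fix: reset before each retry
            try
            {
                OpenWithPassword(content, password);
                return true;
            }
            catch (CryptographicException)
            {
                // Wrong password: fall through to the next candidate.
            }
        }
        return false;
    }
}
```

Without the `content.Position = 0` line, the second attempt in this sketch starts at the end of the 4-byte stream and fails with `EndOfStreamException`, which mirrors the failure mode reported for the password retries.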

gfs · 2026-02-06T19:04:48Z

/azp run

azure-pipelines · 2026-02-06T19:05:01Z

Azure Pipelines successfully started running 1 pipeline(s).

gfs · 2026-02-06T21:22:08Z

@copilot something I wonder about these new test samples for these formats: the fallback case of extract-self-on-fail would also result in 1 file being extracted. I think maybe our expected-num-files tests should check that the status isn't set to failed archive. What do you think?

Three test methods (ExtractArchiveParallel, ExtractArchiveFromStreamAsync, ExtractArchiveFromStream) only verified result count, which could mask extraction failures for single-file archives (ARJ, ARC, ACE) since ExtractSelfOnFail returns 1 result with FailedArchive status. Now all three also assert that no result has FailedArchive status. Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Copilot · 2026-02-06T21:28:38Z

@copilot something I wonder about these new test samples for these formats - the fallback case of extract self on fail also would result in 1 file being extracted - I think maybe our ex...

Great point — with ExtractSelfOnFail=true and 1-file archives, a broken extractor still returns 1 result, so the count assertion alone can't distinguish success from failure. Fixed in ac4537f by adding FailedArchive status assertions to the three test methods that previously only checked count: ExtractArchiveParallel, ExtractArchiveFromStreamAsync, and ExtractArchiveFromStream. The other methods (ExtractArchive, ExtractArchiveAsync, ExtractArchiveNoRecursion, ExtractArchiveParallelNoRecursion, ExtractArchiveSmallBatchSize) already had status verification.
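To make the masking problem concrete, here is a small self-contained sketch. The `SketchStatus` enum and `SketchEntry` class are hypothetical simplifications, not the library's `FileEntryStatus` or `FileEntry` types; only the idea (check status as well as count) comes from the discussion above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the library's status and entry types.
public enum SketchStatus { Default, FailedArchive }

public sealed class SketchEntry
{
    public string Name { get; }
    public SketchStatus Status { get; }

    public SketchEntry(string name, SketchStatus status)
    {
        Name = name;
        Status = status;
    }
}

public static class ExtractionChecks
{
    // Count-only check: passes for a one-file archive even when extraction
    // failed and the archive itself was returned as the single result.
    public static bool CountMatches(IReadOnlyCollection<SketchEntry> results, int expected)
        => results.Count == expected;

    // Stricter check: the count must match and no result may be a failed archive.
    public static bool ExtractedCleanly(IReadOnlyCollection<SketchEntry> results, int expected)
        => results.Count == expected
           && results.All(r => r.Status != SketchStatus.FailedArchive);
}
```

For a single result carrying `FailedArchive`, `CountMatches(results, 1)` is true while `ExtractedCleanly(results, 1)` is false, which is the distinction the added assertions are meant to capture.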

gfs · 2026-02-06T21:34:28Z

/azp run

azure-pipelines · 2026-02-06T21:34:40Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Copilot reviewed 11 out of 14 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (5)

RecursiveExtractor/Extractors/AceExtractor.cs:136

The entry stream from aceReader.OpenEntryStream() is not disposed. Since FileEntry copies the stream when passthroughStream is false, dispose the entry stream immediately after constructing the FileEntry to avoid holding reader resources longer than needed.

                        try
                        {
                            governor.CheckResourceGovernor(entry.Size);
                            var stream = aceReader.OpenEntryStream();
                            var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
                            if (string.IsNullOrEmpty(name))
                            {
                                Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ACE, fileEntry.FullPath);
                                continue;
                            }
                            newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
                        }

RecursiveExtractor/Extractors/ArjExtractor.cs:136

The entry stream from arjReader.OpenEntryStream() is not disposed. Since FileEntry copies the stream when passthroughStream is false, the source stream can be disposed immediately after constructing the FileEntry to avoid holding resources unnecessarily.

                        {
                            var stream = arjReader.OpenEntryStream();
                            var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
                            if (string.IsNullOrEmpty(name))
                            {
                                Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ARJ, fileEntry.FullPath);
                                continue;
                            }
                            newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
                        }

RecursiveExtractor/Extractors/ArcExtractor.cs:138

The entry stream returned by arcReader.OpenEntryStream() is never disposed. Because FileEntry copies the contents (passthrough=false), it’s safe to dispose the entry stream immediately after constructing new FileEntry(...).

                        try
                        {
                            var stream = arcReader.OpenEntryStream();
                            var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
                            if (string.IsNullOrEmpty(name))
                            {
                                Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ARC, fileEntry.FullPath);
                                continue;
                            }
                            newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
                        }
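The reasoning in these three comments (the entry copies the data, so the source stream can be released right away) can be sketched as follows. `BufferedEntry` is a hypothetical type that copies eagerly in its constructor; it is not the library's `FileEntry`, whose copy behavior is only known here from the review text.

```csharp
using System.IO;

// Hypothetical entry type that copies its source eagerly, standing in for
// the behavior described above when passthroughStream is false.
public sealed class BufferedEntry
{
    public MemoryStream Content { get; } = new MemoryStream();

    public BufferedEntry(Stream source)
    {
        source.CopyTo(Content);
        Content.Position = 0; // leave the copy ready to be read from the start
    }
}

public static class EntryCopySketch
{
    // Builds the entry, then disposes the source stream as soon as the copy
    // is complete, so no reader resources are held longer than needed.
    public static BufferedEntry CopyAndRelease(Stream source)
    {
        using (source)
        {
            return new BufferedEntry(source);
        }
    }
}
```

Because the copy finishes inside the constructor, disposing `source` afterwards does not affect `Content`.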

RecursiveExtractor/MiniMagic.cs:191

These 'if' statements can be combined.

            if (bytesRead >= 4)
            {
                if (buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
                {
                    return ArchiveFileType.ZIP;
                }
            }

RecursiveExtractor/MiniMagic.cs:211

These 'if' statements can be combined.

            if (bytesRead >= 7)
            {
                if (buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72 && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
                {
                    return ArchiveFileType.RAR;
                }
            }

gfs · 2026-02-06T23:36:11Z

/azp run

azure-pipelines · 2026-02-06T23:36:23Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Copilot reviewed 11 out of 14 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (4)

RecursiveExtractor/Extractors/AceExtractor.cs:140

Same issue in the sync path: using var stream = aceReader.OpenEntryStream() is followed by { ... }, which won’t compile. Use a using (...) { ... } statement or a using var ...; declaration without a block.

                            using var stream = aceReader.OpenEntryStream()
                            {
                                var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
                                if (string.IsNullOrEmpty(name))
                                {
                                    Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ACE, fileEntry.FullPath);
                                    continue;
                                }
                                newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
                            }

RecursiveExtractor/Extractors/ArcExtractor.cs:142

Same issue in the sync path: using var stream = arcReader.OpenEntryStream() is written with a following { ... } block, which won’t compile. Use using (var stream = ...) { ... } or a using var ...; declaration without a block.

                            using var stream = arcReader.OpenEntryStream()
                            {
                                var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
                                if (string.IsNullOrEmpty(name))
                                {
                                    Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ARC, fileEntry.FullPath);
                                    continue;
                                }
                                newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
                            }

RecursiveExtractor/MiniMagic.cs:191

These 'if' statements can be combined.

            if (bytesRead >= 4)
            {
                if (buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
                {
                    return ArchiveFileType.ZIP;
                }
            }

RecursiveExtractor/MiniMagic.cs:211

These 'if' statements can be combined.

            if (bytesRead >= 7)
            {
                if (buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72 && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
                {
                    return ArchiveFileType.RAR;
                }
            }

Copilot · 2026-02-06T23:44:27Z

RecursiveExtractor/Extractors/AceExtractor.cs

+                        }
+
+                        governor.CheckResourceGovernor(entry.Size);
+                        using var entryStream = aceReader.OpenEntryStream()


using var entryStream = aceReader.OpenEntryStream() is followed by a block ({ ... }), which is invalid syntax for a using-declaration and will not compile. Convert this to a using (...) { ... } statement or remove the braces and terminate the declaration with ;.

This issue also appears on line 131 of the same file.

Suggested change

- using var entryStream = aceReader.OpenEntryStream()
+ using (var entryStream = aceReader.OpenEntryStream())

Copilot · 2026-02-06T23:44:28Z

RecursiveExtractor/Extractors/AceExtractor.cs

+            try
+            {
+                aceReader = AceReader.Open(fileEntry.Content, new ReaderOptions()
+                {
+                    LeaveStreamOpen = true
+                });


Reset fileEntry.Content.Position = 0 before calling AceReader.Open(...) (in both async and sync methods). This matches the established extractor pattern (e.g., Zip/GZip) and avoids failures when the stream position isn’t at the beginning.

Copilot · 2026-02-06T23:44:28Z

RecursiveExtractor.Tests/ExtractorTests/ExpectedNumFilesTests.cs

+            var resultsList = extractor.Extract(path, stream, GetExtractorOptions()).ToList();
+            Assert.DoesNotContain(resultsList, r => r.EntryStatus == FileEntryStatus.FailedArchive);
+            Assert.Equal(expectedNumFiles, resultsList.Count);
            stream.Close();


Redundant stream.Close() call: the stream is already disposed by the using var stream = new FileStream(...). Removing the explicit close reduces noise and avoids confusion about ownership/lifetime.

Suggested change

- stream.Close();

Copilot · 2026-02-06T23:44:28Z

RecursiveExtractor/Extractors/ArcExtractor.cs

+                            continue;
+                        }
+
+                        using var entryStream = arcReader.OpenEntryStream()


using var entryStream = arcReader.OpenEntryStream() is followed by a block ({ ... }), which is invalid syntax for a using-declaration and will not compile. Change to a using (...) { ... } statement or end the declaration with ; and remove the braces.

This issue also appears on line 133 of the same file.

Suggested change

- using var entryStream = arcReader.OpenEntryStream()
+ using (var entryStream = arcReader.OpenEntryStream())

Copilot · 2026-02-06T23:44:28Z

RecursiveExtractor/Extractors/ArjExtractor.cs

+                            using var stream = arjReader.OpenEntryStream()
+                            {
+                                var name = entry.Key?.Replace('/', Path.DirectorySeparatorChar);
+                                if (string.IsNullOrEmpty(name))
+                                {
+                                    Logger.Debug(Extractor.ENTRY_MISSING_NAME_ERROR_MESSAGE_STRING, ArchiveFileType.ARJ, fileEntry.FullPath);
+                                    continue;
+                                }
+                                newFileEntry = new FileEntry(name, stream, fileEntry, false, entry.CreatedTime, entry.LastModifiedTime, entry.LastAccessedTime, memoryStreamCutoff: options.MemoryStreamCutoff);
+                            }


using var stream = arjReader.OpenEntryStream() is written with a following block ({ ... }), which is invalid C# syntax for a using-declaration and will not compile. Use either a using (...) { ... } statement or a using var stream = ...; declaration (and remove the braces).
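The two valid forms named in these comments look like this in isolation. The sketch uses a `MemoryStream` rather than a SharpCompress reader, and the class and method names are made up for illustration.

```csharp
using System.IO;

public static class UsingForms
{
    // Using statement: parenthesized resource plus a block; the stream is
    // disposed when control leaves the braces.
    public static int FirstByteWithStatement(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        {
            return stream.ReadByte();
        }
    }

    // Using declaration (C# 8 and later): terminated with ';' and no block;
    // the stream is disposed at the end of the enclosing scope.
    public static int FirstByteWithDeclaration(byte[] data)
    {
        using var stream = new MemoryStream(data);
        return stream.ReadByte();
    }
}
```

Mixing the two, `using var stream = ...` with no semicolon followed by `{ ... }`, is the form the review flags as a compile error.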

Copilot · 2026-02-06T23:44:28Z

RecursiveExtractor/Extractors/ArjExtractor.cs

+            try
+            {
+                arjReader = ArjReader.Open(fileEntry.Content, new ReaderOptions()
+                {
+                    LeaveStreamOpen = true
+                });


Before calling ArjReader.Open(...), reset fileEntry.Content.Position = 0 (and similarly in the sync Extract method). Most other extractors do this to avoid failures when the stream has been read or repositioned elsewhere before extraction.

Copilot · 2026-02-06T23:44:29Z

RecursiveExtractor/Extractors/ArcExtractor.cs

+            try
+            {
+                arcReader = ArcReader.Open(fileEntry.Content, new ReaderOptions()
+                {
+                    LeaveStreamOpen = true
+                });


Consider resetting fileEntry.Content.Position = 0 before ArcReader.Open(...) in both ExtractAsync and Extract. Other extractors do this to ensure extraction works even if the stream has been partially read before the extractor runs.

Copilot · 2026-02-06T23:44:29Z

RecursiveExtractor/MiniMagic.cs

+            if (bytesRead >= 3)
+            {
+                if (buffer[0] == 0x42 && buffer[1] == 0x5A && buffer[2] == 0x68)
                {
-                    return ArchiveFileType.GZIP;
+                    return ArchiveFileType.BZIP2;
+                }
+            }


These 'if' statements can be combined.

This issue also appears in the following locations of the same file:

line 185

line 205

See below for a potential fix:

            if (bytesRead >= 3 && buffer[0] == 0x42 && buffer[1] == 0x5A && buffer[2] == 0x68)
            {
                return ArchiveFileType.BZIP2;
            }
            if (bytesRead >= 4 && buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
            {
                return ArchiveFileType.ZIP;
            }
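As a standalone illustration of gating each signature check on the number of bytes actually read, the following sketch uses the three signatures quoted in this review (BZIP2, ZIP, RAR). The `SketchType` enum and the `Detect` helper are hypothetical and do not reproduce MiniMagic's real API.

```csharp
public enum SketchType { Unknown, Bzip2, Zip, Rar }

public static class SignatureSketch
{
    // 'buffer' may be larger than the data read into it, so every check is
    // gated on 'bytesRead' to avoid matching stale or zero-filled bytes.
    public static SketchType Detect(byte[] buffer, int bytesRead)
    {
        // "BZh"
        if (bytesRead >= 3 && buffer[0] == 0x42 && buffer[1] == 0x5A && buffer[2] == 0x68)
        {
            return SketchType.Bzip2;
        }
        // "PK\x03\x04"
        if (bytesRead >= 4 && buffer[0] == 0x50 && buffer[1] == 0x4B
            && buffer[2] == 0x03 && buffer[3] == 0x04)
        {
            return SketchType.Zip;
        }
        // "Rar!\x1A\x07\x00"
        if (bytesRead >= 7 && buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72
            && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
        {
            return SketchType.Rar;
        }
        return SketchType.Unknown;
    }
}
```

Because `&&` short-circuits, the length test runs first and the byte comparisons are skipped when too few bytes were read, so the combined form behaves the same as the nested one.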

Initial plan

b3482f9

Copilot AI assigned Copilot and gfs Feb 6, 2026

Copilot started work on behalf of gfs February 6, 2026 18:12

Copilot AI and others added 3 commits February 6, 2026 18:23

Add ARJ, ARC, ACE support and update agent instructions for NBGV deep clone requirement

5d90944

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Fix ACE test data file CRC and restore nuget.config

e03db48

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Improve ACE resource governor check to validate before decompression

4f47643

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Reimplement arj and arc changes with cleaner commit history~~ Add ARJ, ARC, and ACE archive decompression support Feb 6, 2026

Copilot finished work on behalf of gfs February 6, 2026 18:37

Copilot AI requested a review from gfs February 6, 2026 18:37

gfs requested a review from Copilot February 6, 2026 18:42

Copilot started reviewing on behalf of gfs February 6, 2026 18:42

gfs mentioned this pull request Feb 6, 2026

Add ARJ and ARC decompression support #186

Closed

9 tasks

Copilot AI reviewed Feb 6, 2026

Copilot started work on behalf of gfs February 6, 2026 18:53

Fix DetectFileType regression: read min(length, buffer) bytes and gate each signature check on bytes actually read

fba41e2

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>

Copilot finished work on behalf of gfs February 6, 2026 18:57

Copilot started work on behalf of gfs February 6, 2026 18:58

Copilot finished work on behalf of gfs February 6, 2026 19:04

Copilot started work on behalf of gfs February 6, 2026 21:22

Copilot finished work on behalf of gfs February 6, 2026 21:29

gfs requested a review from Copilot February 6, 2026 21:36

Copilot started reviewing on behalf of gfs February 6, 2026 21:37

Copilot AI reviewed Feb 6, 2026

Refactor entry stream handling in ACE, ARC, and ARJ extractors to use 'using' statements for better resource management

755b17e

gfs requested a review from Copilot February 6, 2026 23:35

Copilot started reviewing on behalf of gfs February 6, 2026 23:36

Copilot AI reviewed Feb 6, 2026
